54 research outputs found

    Coarse-Graining Auto-Encoders for Molecular Dynamics

    Molecular dynamics simulations provide theoretical insight into the microscopic behavior of materials in condensed phase and, as a predictive tool, enable computational design of new compounds. However, because of the large temporal and spatial scales involved in thermodynamic and kinetic phenomena in materials, atomistic simulations are often computationally infeasible. Coarse-graining methods allow simulating larger systems, by reducing the dimensionality of the simulation, and propagating longer timesteps, by averaging out fast motions. Coarse-graining involves two coupled learning problems: defining the mapping from an all-atom to a reduced representation, and parametrizing a Hamiltonian over coarse-grained coordinates. Multiple statistical mechanics approaches have addressed the latter, but the former is generally a hand-tuned process based on chemical intuition. Here we present Autograin, an optimization framework based on auto-encoders to learn both tasks simultaneously. Autograin is trained to learn the optimal mapping between all-atom and reduced representation, using the reconstruction loss to facilitate the learning of coarse-grained variables. In addition, a force-matching method is applied to variationally determine the coarse-grained potential energy function. This procedure is tested on a number of model systems including single-molecule and bulk-phase periodic simulations. Comment: 8 pages, 6 figures

    Simulations with machine learning potentials identify the ion conduction mechanism mediating non-Arrhenius behavior in LGPS

    Li$_{10}$Ge(PS$_6$)$_2$ (LGPS) is a highly concentrated solid electrolyte, in which Coulombic repulsion between neighboring cations is hypothesized as the underlying reason for concerted ion hopping, a mechanism common among superionic conductors such as Li$_7$La$_3$Zr$_2$O$_{12}$ (LLZO) and Li$_{1.3}$Al$_{0.3}$Ti$_{1.7}$(PO$_4$)$_3$ (LATP). While first principles simulations using molecular dynamics (MD) provide insight into the Li$^+$ transport mechanism, historically, there has been a gap in the temperature ranges studied in simulations and experiments. Here, we used a neural network (NN) potential trained on density functional theory (DFT) simulations, to run up to 40-nanosecond long MD simulations at DFT-like accuracy to characterize the ion conduction mechanisms across a range of temperatures that includes previous simulations and experimental studies. We have confirmed a Li$^+$ sublattice phase transition in LGPS around 400 K, below which the \textit{ab}-plane diffusivity $D^*_{ab}$ is drastically reduced. Concomitant with the sublattice phase transition near 400 K, there is less cation-cation (cross) correlation, as characterized by Haven ratios closer to 1, and the vibrations in the system are more harmonic at lower temperature. Intuitively, at high temperature, the collection of vibrational modes may be sufficient to drive concerted ion hops. However, near room temperature, the vibrational modes available may be insufficient to overcome electrostatic repulsion, thus resulting in less correlated ion motion and comparatively slower ion conduction. Such phenomena of a sublattice phase transition, below which concerted hopping plays a less significant role, may be extended to other highly concentrated solid electrolytes such as LLZO and LATP.

    Learning Pair Potentials using Differentiable Simulations

    Learning pair interactions from experimental or simulation data is of great interest for molecular simulations. We propose a general stochastic method for learning pair interactions from data using differentiable simulations (DiffSim). DiffSim defines a loss function based on structural observables, such as the radial distribution function, through molecular dynamics (MD) simulations. The interaction potentials are then learned directly by stochastic gradient descent, using backpropagation to calculate the gradient of the structural loss metric with respect to the interaction potential through the MD simulation. This gradient-based method is flexible and can be configured to simulate and optimize multiple systems simultaneously. For example, it is possible to simultaneously learn potentials for different temperatures or for different compositions. We demonstrate the approach by recovering simple pair potentials, such as Lennard-Jones systems, from radial distribution functions. We find that DiffSim can be used to probe a wider functional space of pair potentials compared to traditional methods like Iterative Boltzmann Inversion. We show that our methods can be used to simultaneously fit potentials for simulations at different compositions and temperatures to improve the transferability of the learned potentials. Comment: 12 pages, 10 figures
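
For orientation, the traditional baseline named in the abstract above, Iterative Boltzmann Inversion (IBI), can be sketched in a few lines. This is not the DiffSim method and not the authors' code: it is a minimal illustration of the standard IBI update U_{n+1}(r) = U_n(r) + kT ln(g_n(r) / g_target(r)) on a tabulated pair potential, together with the 12-6 Lennard-Jones form. The function names, the damping factor `alpha`, and the numerical `floor` are assumptions introduced here for illustration.

```python
import numpy as np

def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones pair potential (reduced units by default)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 ** 2 - sr6)

def ibi_update(u_current, g_current, g_target, kT=1.0, alpha=1.0, floor=1e-8):
    """One Iterative Boltzmann Inversion step on a tabulated potential.

    u_current : potential values on a radial grid
    g_current : RDF measured with u_current on the same grid
    g_target  : reference RDF on the same grid
    alpha     : hypothetical damping factor (1.0 = undamped update)
    floor     : hypothetical lower bound that keeps the logarithm finite
    """
    g_cur = np.maximum(g_current, floor)
    g_tgt = np.maximum(g_target, floor)
    return u_current + alpha * kT * np.log(g_cur / g_tgt)
```

In a real workflow each call to `ibi_update` would be followed by a new simulation to measure `g_current`; when the measured and target RDFs agree, the update leaves the potential unchanged.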

    Chemistry-informed Macromolecule Graph Representation for Similarity Computation and Supervised Learning

    Macromolecules are large, complex molecules composed of covalently bonded monomer units, existing in different stereochemical configurations and topologies. As a result of such chemical diversity, representing, comparing, and learning over macromolecules emerge as critical challenges. To address this, we developed a macromolecule graph representation, with monomers and bonds as nodes and edges, respectively. We captured the inherent chemistry of the macromolecule by using molecular fingerprints for node and edge attributes. For the first time, we demonstrated computation of chemical similarity between two macromolecules of varying chemistry and topology, using exact graph edit distances and graph kernels. We also trained graph neural networks for a variety of glycan classification tasks, achieving state-of-the-art results. Our work has two-fold implications: it provides a general framework for representation, comparison, and learning of macromolecules, and it enables quantitative chemistry-informed decision-making and iterative design in the macromolecular chemical space. Comment: Main text: 4 pages, 2 figures, 1 table; Appendix: 18 pages, 25 figures, 3 tables

    Differentiable sampling of molecular geometries with uncertainty-based adversarial attacks

    Neural network (NN) interatomic potentials provide fast prediction of potential energy surfaces, closely matching the accuracy of the electronic structure methods used to produce the training data. However, NN predictions are only reliable within well-learned training domains, and show volatile behavior when extrapolating. Uncertainty quantification approaches can flag atomic configurations for which prediction confidence is low, but arriving at such uncertain regions requires expensive sampling of the NN phase space, often using atomistic simulations. Here, we exploit automatic differentiation to drive atomistic systems towards high-likelihood, high-uncertainty configurations without the need for molecular dynamics simulations. By performing adversarial attacks on an uncertainty metric, informative geometries that expand the training domain of NNs are sampled. When combined with an active learning loop, this approach bootstraps and improves NN potentials while decreasing the number of calls to the ground truth method. This efficiency is demonstrated on sampling of kinetic barriers and collective variables in molecules, and can be extended to any NN potential architecture and materials system. Comment: 12 pages, 4 figures, supporting information

    Photocell optimization using dark state protection

    This work was supported by the Leverhulme Trust (RPG-080). EMG is supported by the Royal Society of Edinburgh/Scottish Government. RGB thanks Samsung Advanced Institute of Technology for funding. AF thanks the Anglo-Israeli association and the Anglo-Jewish association for funding. Conventional photocells suffer a fundamental efficiency threshold imposed by the principle of detailed balance, reflecting the fact that good absorbers must necessarily also be fast emitters. This limitation can be overcome by "parking" the energy of an absorbed photon in a dark state which neither absorbs nor emits light. Here we argue that suitable dark states occur naturally as a consequence of the dipole-dipole interaction between two proximal optical dipoles for a wide range of realistic molecular dimers. We develop an intuitive model of a photocell comprising two light-absorbing molecules coupled to an idealized reaction centre, showing asymmetric dimers are capable of providing a significant enhancement of light-to-current conversion under ambient conditions. We conclude by describing a road map for identifying suitable molecular dimers for demonstrating this effect by screening a very large set of possible candidate molecules. Postprint. Peer reviewed.

    Automated patent extraction powers generative modeling in focused chemical spaces

    Deep generative models have emerged as an exciting avenue for inverse molecular design, with progress coming from the interplay between training algorithms and molecular representations. One of the key challenges in their applicability to materials science and chemistry has been the lack of access to sizeable training datasets with property labels. Published patents contain the first disclosure of new materials prior to their publication in journals, and are a vast source of scientific knowledge that has remained relatively untapped in the field of data-driven molecular design. Because patents are filed seeking to protect specific uses, molecules in patents can be considered to be weakly labeled into application classes. Furthermore, patents published by the US Patent and Trademark Office (USPTO) are downloadable and have machine-readable text and molecular structures. In this work, we train domain-specific generative models using patent data sources by developing an automated pipeline to go from USPTO patent digital files to the generation of novel candidates with minimal human intervention. We test the approach on two in-class extracted datasets, one in organic electronics and another in tyrosine kinase inhibitors. We then evaluate the ability of generative models trained on these in-class datasets on two categories of tasks (distribution learning and property optimization), identify strengths and limitations, and suggest possible explanations and remedies that could be used to overcome these in practice.

    From free-energy profiles to activation free energies

    Given a chemical reaction going from reactant (R) to the product (P) on a potential energy surface (PES) and a collective variable (CV) discriminating between R and P, we define the free-energy profile (FEP) as the logarithm of the marginal Boltzmann distribution of the CV. This FEP is not a true free energy. Nevertheless, it is common to treat the FEP as the “free-energy” analog of the minimum potential energy path and to take the activation free energy, $\Delta F^{\ddagger}_{RP}$, as the difference between the maximum at the transition state and the minimum at R. We show that this approximation can result in large errors. The FEP depends on the CV and is, therefore, not unique. For the same reaction, different discriminating CVs can yield different $\Delta F^{\ddagger}_{RP}$. We derive an exact expression for the activation free energy that avoids this ambiguity. We find $\Delta F^{\ddagger}_{RP}$ to be a combination of the probability of the system being in the reactant state, the probability density on the dividing surface, and the thermal de Broglie wavelength associated with the transition. We apply our formalism to simple analytic models and realistic chemical systems and show that the FEP-based approximation applies only at low temperatures for CVs with a small effective mass. Most chemical reactions occur on complex, high-dimensional PES that cannot be treated analytically and pose the added challenge of choosing a good CV. We study the influence of that choice and find that, while the reaction free energy is largely unaffected, $\Delta F^{\ddagger}_{RP}$ is quite sensitive.
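
The conventional FEP-based estimate that the abstract above cautions against can be sketched as follows. This is only an illustration of the common practice, assuming the usual convention F(s) = -kT ln p(s) with p(s) estimated from a histogram of sampled CV values. It is not the exact expression the authors derive. The function names and the reactant and transition-state windows are hypothetical.

```python
import numpy as np

def free_energy_profile(cv_samples, bins=50, kT=1.0):
    """Histogram estimate of F(s) = -kT ln p(s), shifted so that min(F) = 0."""
    density, edges = np.histogram(cv_samples, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    mask = density > 0  # empty bins would give an infinite free energy
    fep = -kT * np.log(density[mask])
    return centers[mask], fep - fep.min()

def naive_barrier(centers, fep, reactant_window, ts_window):
    """FEP-based barrier: max of F near the TS minus min of F in the reactant basin."""
    in_r = (centers >= reactant_window[0]) & (centers <= reactant_window[1])
    in_ts = (centers >= ts_window[0]) & (centers <= ts_window[1])
    return fep[in_ts].max() - fep[in_r].min()
```

Because the histogram is taken over one particular CV, a different discriminating CV would generally give a different profile and a different barrier, which is the ambiguity the abstract describes.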